ABSTRACT

This review of psychometric research in reading 
analyzes the factors which seem related to reading comprehension 
skills. Experimental analysis of reading comprehension by E. L. 
Thorndike revealed two major components: knowledge of word meanings 
and verbal reasoning abilities. Subsequent analysis of experimental 
studies of reading comprehension confirmed Thorndike's conclusions 
and added the skills of (1) obtaining literal sense meaning from a 
passage, (2) following the structure (syntax) of the passage, and (3) 
recognizing the literary techniques used by an author. Other tests of 
reading speed and comprehension also confirm these conclusions. 
Statistical techniques of substrata analysis and regression analysis 
are criticized for their lack of validity and their misleading 
conclusions. Thorndike’s conclusions are pronounced confirmed and 
sound, and suggestions are made for applications of these conclusions 
to techniques and materials for reading instruction. References are 
cited. (AL) 
in the publication of hundreds of studies involving the measurement of 
various aspects of reading. In this paper, I shall limit myself to a relatively 
few studies of the process of comprehension in reading and shall try to point 
out some of the practical consequences of those studies. 

The first systematic experimental analysis of comprehension was reported 
in 1917 by Professor Edward Lee Thorndike (1917a, 1917b, 1917c). He presented 
short paragraphs to elementary-school pupils and asked them to write answers to 
simple questions based on those paragraphs. The pupils were given unlimited 
time and allowed to refer to the paragraphs as often as they wished while they 
were composing or writing their responses. Thorndike found that, even when 
the pupils understood the meanings of the individual words or phrases in a 
paragraph, many of them made errors in answering the questions about it. He 
carefully classified the responses of the children and analyzed the nature of 
the errors that they made. The resulting data led him to conclude that the 
pupils were unable to fit together the separate ideas expressed in a paragraph 
and to give individual words or separate word groups the proper amount of 
emphasis in relation to one another. For example, the pupils were unable to 
use connective words or phrases (such as "but" or "on the contrary") to link 
ideas together in the proper relationships. He wrote: 

Understanding a paragraph is like solving a problem in mathematics. 
It consists in selecting the right elements of the situation and putting 
them together in the right relations, and also with the right amount of 
weight or influence or force for each (1917b, p. 329). 

Understanding a ... printed paragraph is then a matter of habits, 
connections, mental bonds, but these have to be selected from so many 
others, and given weights so delicately, and used together in so elaborate 
an organization that "to read" means "to think" as truly as does "to 
evaluate" or "to invent" or "to demonstrate" or "to verify" (1917c, p. 114). 

The last quotation formed the kernel of a vast literature on the teaching 
of reading as a process of thinking. It should be noted, however, that in 
his analysis of comprehension Thorndike fully recognized the importance of 
association, or memory, in evoking meanings attached to words or phrases as 
well as the importance of reasoning with these to achieve understanding of 
ideas expressed by an author. We may say that, in his informal "factor analysis" 
of reading, Thorndike identified two major components: (1) Knowledge of word 
meanings, and (2) Verbal reasoning ability. As a practical consequence, the 
teaching of reading came more and more to emphasize understanding and less and 
less to emphasize word calling; that is, decoding words by pronouncing them 
correctly regardless of whether they had meaning for the "reader." 

One of the first factor analyses of reading tests was reported by Feder 
(1938). He analyzed scores from tests, originally devised by Adler (1936), 
that included acquisition of facts, drawing inferences, appreciating passages, 
and speed of reading easy materials. He found that the factual-information 
tests loaded heavily on a factor different from the inference tests. The 
appreciation test appeared to be most closely associated with the inference 
tests and the speed-of-reading test appeared to measure functions different 
from the others. 
In 1939, Davis (1940) compiled a list of hundreds of skills of comprehension 
in reading that had been suggested by previous investigators. Once the skills 
that appeared to be highly overlapping had been combined, the outline of 
skills of comprehension that is shown in Table 1 emerged. This list reflects 
rather plainly the influence of an experiment performed by I. A. Richards 
pertaining to the comprehension of poetry. These data and their implications 
were discussed in Practical Criticism by Richards (1929). Davis's (1940) outline 
formed the list of criterion skills measured by the Cooperative Reading Comprehension 
Tests, Forms Q through Z (Davis, et al., 1940-1950). In practice, two of the 
skills did not lend themselves to measurement with multiple-choice items and 
had to be dropped. The remainder were grouped for convenience into nine skills 
and are so identified in Table 1. Every item in the twelve forms of the 
Cooperative Reading Comprehension Tests was classified as measuring principally 
one of these skills. The validity of the tests was thus established in the 
same way as the validity of criterion-referenced tests, as they are often called 
nowadays. 

About fifteen years later, the Taxonomy of Educational Objectives (Bloom, et 
al., 1956) listed many of the same skills, though under different names, as 
major objectives in the cognitive domain. To show the resemblance of the two 
sets of objectives in comprehension, the Bloom categories are cross-referenced 
to the Davis list of nine skills in Table 1. 

Davis (1941, 1944) published the results of a principal-components analysis 
of a matrix of variances and covariances obtained by administering the nine 
basic skills of comprehension, as measured by items in the lower and higher levels 
of Form Q of the Cooperative Reading Comprehension Tests, to 421 college freshmen. 
Because the items had been constructed with the intention of making them measure 
as separately as possible the various skills of comprehension listed in Table 1, 
the analysis was done in such a way as to retain in the results the unique 
elements, if any, of the skills measured. As is well known, five or six of 
the skills were found to have unique nonchance elements at a rather high level of 
statistical significance. The practical significance of the findings was that 
comprehension should not be regarded as a unitary trait. Consequently, the 
teaching of reading should presumably involve systematic instruction in various 
skills and any assessment of level of comprehension should involve the measurement 
of these different skills in combination. As a matter of fact, tests of 
comprehension in reading constructed since 19*0 have tended to reflect these 
findings. 
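
The logic of a principal-components analysis of a variance-covariance matrix can be sketched in a few lines of Python. This is a minimal illustration only: the 3 x 3 covariance matrix, the function names, and the use of power iteration are assumptions made for the example, not the data or the computational procedure of the 1941 study.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def leading_component(cov, iterations=500):
    """Power iteration: return (variance, unit weights) of the largest component."""
    vector = [1.0] * len(cov)
    for _ in range(iterations):
        product = mat_vec(cov, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient of the converged unit vector gives the component variance.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(cov, vector)))
    return eigenvalue, vector


# Hypothetical covariance matrix for three skill scores (made-up numbers).
COV = [
    [4.0, 2.0, 1.0],
    [2.0, 3.0, 1.0],
    [1.0, 1.0, 2.0],
]

eigenvalue, weights = leading_component(COV)
total_variance = sum(COV[i][i] for i in range(len(COV)))
share = eigenvalue / total_variance

print(f"variance of first component: {eigenvalue:.3f}")
print(f"share of total variance:     {share:.3f}")
```

If comprehension were a unitary trait, nearly all of the nonchance variance would be expected to fall on the first component. Whether the remaining variance exceeds chance is a separate question of significance testing that this sketch does not address.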

Although Davis’s 1941 experimental findings were not designed to ascertain 
the relative importance of different skills in comprehension, the results 
suggested that knowledge of word meanings and reasoning in reading were of 
greatest importance. Three or four additional skills were shown to be present in 
amounts that could not readily be explained by chance but they are apparently 
of far less general utility in the reading of materials customarily encountered 
by pupils in grades 7-12. 

It should be noted that Davis’s rather extensive experimental study essentially 
confirmed the insightful conclusions of Professor E. L. Thorndike in 1917 that 
(1) Knowledge of word meanings and (2) Reasoning with these meanings are the 
major components of comprehension. To these were added: (3) Getting the 
literal sense meaning of a passage; (4) Following the structure (the syntax) 
of a passage; and (5) Recognizing the literary techniques used by an author. 

To check up on these conclusions, Davis (1968) conducted a much more 
elaborate study that was published in full in the summer issue of the Reading 
Research Quarterly. This made rather difficult reading for laymen, but its 
results are shown in a simple way in Table 2. Here we see two estimates of the 
per cent of the true variance of each of eight comprehension skills that are 
unique. It will be noted that, in a sample of 988 high-school seniors, the largest 
of these are: 

1. Recalling word meanings; 

2. Drawing inferences from the content (reasoning); 

3. Following the structure of a passage; 

4. Recognizing a writer’s purpose, attitude, tone, and mood; 

5. Finding answers to questions answered explicitly or in paraphrase; 
that is, getting the literal sense meaning. 

Four of these were judged significant in Davis's earlier study (Davis, 1941, 
1944). Thus, considerable overlap in the results is shown despite the fact 
that different passages and items, different techniques of analysis, and examinees 
at different grade levels were used in the 1941 and 1968 studies. A component 
analysis of the 1968 data, first published in 1971 in The Literature of Research 
in Reading, With Emphasis on Models (Davis, 1971), confirms Components I and V 
of the 1941 study and splits Component II of that study into two separate factors. 
The first of these emphasizes Making inferences from the content (Skill 5) and 
the second emphasizes Finding answers to questions answered explicitly or in 
paraphrase (Skill 3) and Weaving together ideas in the content (Skill 4). This 
material is scheduled to appear in the 1972 Summer issue of the Reading Research 
Quarterly. 

As you know, research methodologists refer to a conceptualization of a 
phenomenon (its elements and their interrelations) as a model of that 
phenomenon. Davis's outline in 1939 constituted a model of the 
comprehension process in reading. Since that time, Davis has tested the 
separate reality (or uniqueness) of its elements and their relative importance 
in the process of comprehension among secondary-school and college 
students. Only one other model, or conceptualization of the reading process, 
has been subjected to experimental verification. This is Holmes’s substrata-factor 
theory of reading (Holmes, 1948, 1954), which will be discussed next. 

At this point it may be noted that arm-chair models or partial models of 
the comprehension process in reading that have not been subjected to comprehensive 
experimental verification include those by Albright (1927); Barrett (1968, 
pp. 19-23); Barton (1930); Berry (1931); Bloom, et al. (1956); Cleland 
(1965); Gates (1935); Gray (1919, 1960); Kingston (1961); Robinson (1966); 
Smith (1960); Spache (1962, 1963); Strang (1938); and Yoakum (1928). At 
present, Chapman (1969) is engaged in experimental verification of three models 
that she has proposed. 

Holmes suggested (1948, 1954) the use of multiple-regression analysis to 
identify the relative contributions of a wide selection of variables to the 
variance of speed and power of reading. Experiments to carry out these proposals 
were conducted by Holmes (1948), by Singer (1965a, 1965b), and by Holmes and 
Singer (1966). These experiments were designed to test only a limited part of 
the so-called substrata-factor theory of reading, a rather diffuse statement 
which would be difficult to put to experimental test. 

By "power of reading" Holmes (1948) meant the ability to comprehend rather 
difficult textbook material in generous time limits. This variable has commonly 
been called "level of comprehension" and will be so referred to in this paper. 
Holmes measured it in his 1948 experiment, which involved the analysis of 
data based on testing 126 college students, by using the sum of comparable 
scores on five untimed comprehension subtests in the Diagnostic Examination 
of Silent Reading Abilities (Dvorak & Van Wagenen, 1939). 

The criterion variable of speed of reading was obtained by averaging standard 
scores on the Minnesota Speed of Reading Test for College Students (Eurich, 
1936) and the Rate of Comprehension Test of the Diagnostic Examination of Silent 
Reading Abilities. Both of these tests measure the ability of an examinee to 
detect an absurd word inserted near the end of a rather short, easy passage. 
This ability is not called into play in natural reading situations and may 
lead test-wise examinees to alter their normal reading habits to an unacceptable 
degree. In any case, high scores on these tests are obtained by marking items 
correctly as rapidly as possible during a short time limit. Therefore, both 
comprehension and speed of covering material are measured. This variable is 
commonly referred to as "speed of comprehension" and will be so referred to in 
this paper to distinguish it from mere speed of reading (in number of words 
covered per minute, for example). Unfortunately, the validity of this criterion 
variable is open to question because it is not ordinarily called into play in 
natural reading situations and may lead test-wise examinees to alter their normal 
reading habits to an unacceptable degree. 

The predictor variables, for which it was desired to obtain estimates of the 
proportion of the variances of "level of comprehension" and of "speed of 
comprehension" that they constitute, consisted of 20 measures that had correlations 
with the two criterion variables such that they gave preliminary promise of 
making independent contributions to the variance of level of comprehension and 
speed of comprehension. When the partial regression coefficients were obtained, 
the level-of-comprehension criterion score was added to the 20 predictors of 
speed of comprehension; vice versa, the speed-of-comprehension criterion score 
was added to the 20 predictors of level of comprehension. The Wherry-Doolittle 
procedure was used to obtain partial regression coefficients for only the four 
variables that contributed most to the variance of the two criteria. This 
limitation may have been wise, considering that there were only 126 examinees 
in a study involving 21 predictors. But more meaningful results could have 
been obtained if a large number of examinees had been used and if all 21 
partial regression coefficients per criterion variable had been obtained. 
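
As a hedged illustration of what a full set of partial regression coefficients provides, the Python sketch below solves the normal equations for standardized (beta) weights from a correlation matrix and then uses the identity R² = Σ βⱼ rⱼᵧ to split the predicted criterion variance into one term per predictor. The three predictors and every correlation are made-up values, not figures from Holmes's study, and the helper names are hypothetical. The Wherry-Doolittle selection procedure itself is not reproduced here.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


# Hypothetical correlations among three predictors (made-up numbers).
R_XX = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
# Hypothetical correlations of each predictor with the criterion.
R_XY = [0.7, 0.6, 0.4]

betas = solve(R_XX, R_XY)                          # standardized partial weights
contributions = [b * r for b, r in zip(betas, R_XY)]
r_squared = sum(contributions)                     # proportion of criterion variance predicted

for index, (beta, part) in enumerate(zip(betas, contributions), start=1):
    print(f"predictor {index}: beta = {beta:.3f}, beta * r = {part:.3f}")
print(f"R squared = {r_squared:.3f}")
```

Each product βⱼ rⱼᵧ is an accounting term that depends on which other predictors are in the equation; adding or dropping a correlated predictor changes it.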

In any event, the results showed that somewhat over half of the variance of 
speed of comprehension was attributable to three measures: tests of accuracy 
and speed of word perception and word meaning, and recognition span (a measure 
obtained by eye-movement photography). 

About 80 per cent of the variance of level of comprehension was attributable 
to measures of level of vocabulary, reasoning ability (as measured by the Otis 
Quick-Scoring Mental-Ability Tests), verbal relationships, and the number of 
fixations per 100 running words. The last of these has a negative sign on the 
regression coefficient, indicating that the smaller the number of fixations 
per 100 words, the greater the examinee's level of comprehension tends to be. 
(The zero-order correlation coefficient between these two variables was 
actually .10, which is not significantly different from zero.) This analysis 
provides a very limited amount of information about the underlying elements 
of speed of comprehension by suggesting that the latter is based largely on 
accuracy and speed of word perception and on level of vocabulary. Generally 
speaking, it is believed that accuracy of eye movements and span of recognition 
are influenced by the comprehension levels of individuals rather than the 
reverse. 
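
A negative regression weight alongside a small positive zero-order correlation is arithmetically possible whenever the predictor is correlated with another, stronger predictor. The two-predictor formula below shows this in Python. Only the .10 correlation comes from the text; the other two correlations and all names are assumptions chosen purely for illustration.

```python
def two_predictor_betas(r1y, r2y, r12):
    """Standardized regression weights for a criterion y on predictors 1 and 2."""
    denominator = 1.0 - r12 ** 2
    beta1 = (r1y - r12 * r2y) / denominator
    beta2 = (r2y - r12 * r1y) / denominator
    return beta1, beta2


R_OTHER_CRITERION = 0.60      # assumed: a stronger predictor with the criterion
R_FIXATIONS_CRITERION = 0.10  # zero-order correlation cited in the text
R_OTHER_FIXATIONS = 0.50      # assumed: correlation between the two predictors

beta_other, beta_fixations = two_predictor_betas(
    R_OTHER_CRITERION, R_FIXATIONS_CRITERION, R_OTHER_FIXATIONS
)

print(f"beta for the stronger predictor: {beta_other:.3f}")
print(f"beta for fixations:              {beta_fixations:.3f}")
```

Under these assumed values the weight for fixations is negative even though its zero-order correlation with the criterion is positive, so the sign of a partial regression coefficient by itself says little about the simple relationship between the two variables.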

With respect to comprehension, the analysis appears to reiterate the Thorndike 
and Davis conclusions that knowledge of word meanings and verbal reasoning are 
the two major components of comprehension in reading. 

The reader familiar with Holmes’s practice of using the most important 
predictor variables as the criteria for regression analyses that use the other 
variables in the set as predictors may wonder why I have not discussed these 
so-called substrata analyses in the interpretation of Holmes’s study. The 
answer lies in the fact that, legitimate though these analyses may be in and 
of themselves, it is not legitimate to use the regression coefficients from a 
second-level analysis (say, of the level-of-vocabulary scores that account for 
about 40 per cent of the variance of level of comprehension) with the first-level 
analysis of the level-of-comprehension criterion variable directly to 
show what proportion of the latter is accounted for by some of the original 
21 predictor variables for which regression coefficients were not computed in 
the original analysis. If you want such data (and Holmes seemed to want them), 
you should use a large number of subjects and compute the partial regression 
coefficients for all of the variables (which can be done on large computers in 
a small fraction of a second). I have discussed this matter at some length 
(Davis, 1971, pp. 8-22 to 8-24) and Carroll (1968) earlier drew attention to it, 
diagrammed the situation, and took the same dim view of Holmes’s substrata 
analysis technique that I take, and for the same reasons as nearly as I can tell. 
In fact, if you haven’t read that article by Carroll, which was a review of a 
monograph entitled Speed and Power of Comprehension in Reading by Holmes and 
Singer (1966), it will pay you to locate volume 2 of the journal called Research 
in the Teaching of English in which the review appeared. 

I have pointed out (Davis, 1971, pp. 8-23 to 8-24) that the independent 
contributions to criterion-score variance identified mathematically by a multiple-regression 
analysis cannot usually be properly identified simply by assigning to 
each of these independent parts of the criterion-score variance the name of one 
of the predictor variables. Each packet of predictor-score variance that is 
said to "account for" a portion of the criterion-score variance usually represents 
only one of several different elements that together make up the variance of 
the predictor score. The beta weight associated with a predictor score usually 
reflects the presence of one element of criterion-score variance in more than 
one predictor test. To make inferences about the psychological nature of the 
elements of the variance of predictor tests that make significant contributions 
to the predictable variance of a criterion variable is a complex and delicate 
process that requires intimate knowledge of the skills involved in each 
predictor variable and insightful understanding of what is yielded by multiple 
regression. The interpretation of the results of factor analyses and 
component analyses is equally complex and difficult. My own feeling is that, 
broadly speaking, the statistical technique of substrata analysis that was 
suggested by Holmes (1948, 1954) does not properly lead to the identification of 
substrata factors as he envisaged that it would. Furthermore, the interpretations 
made by Holmes of data obtained in the course of his substrata analyses of the 
nature of the reading process are not statistically sound and may have led to 
misleading conclusions. Finally, his investigations cannot be said either to 
have supported or to have denied support to his basic substrata theory of 
reading. 

The second large-scale regression analysis of comprehension was made by 
Singer (1965a, 1965b). The plan of the study was similar to that of Holmes's 
(1948) study, but the variables were somewhat different and were administered 
to about 250 pupils in each of grades 3, 4, 5, and 6 in Alvord, California. 
The two criterion variables were speed of comprehension, as measured by the 
speed-of-reading subtest in the Gates Reading Survey (Gates, 1953, 1958), 
and level of comprehension, as measured by the subtest of that name in the 
Gates Reading Survey. 

After discussing the contributions of specific tests to the variance 
of the speed-of-comprehension criterion scores, Singer concludes that these 
skills shift from a predominance of visual-perception abilities at the 
third-grade level to a more equal balance with knowledge of word meanings 
at the sixth-grade level. This may, however, reflect an increasing speed 
component from grade to grade in the vocabulary test used. 

The largest proportion of the variance of the level-of-comprehension 
criterion variable was accounted for by scores on the same vocabulary test. 
Without detailed personal knowledge of the elements measured by other 
predictor variables, it is difficult to determine the nature of the criterion 
variance predicted by the three or four tests with fairly large regression 
coefficients. 

A third large-scale regression study was reported by Holmes and Singer 
(1966). They used a sample of 400 pupils in grades 9-12 of the University 
of California Demonstration Summer School of 1953. The procedures used 
in selecting the sample and the wide age and grade range in it raise serious 
problems about generalizing the findings of the study to any representative 
group of American secondary-school pupils. 

As in the two previous studies reported by Holmes (1948, 1954) and Singer 
(1965a, 1965b), measures of speed of comprehension and level of comprehension 
served as criterion variables. Careful examination of these measures leaves 
doubt about their validity. In fact, the names given to some of the predictor 
tests may be misleading. For example, Test 8 is labelled "Vocabulary in Context," 
yet the sample item reads: 

He felt very sad. 

1 timid 
2 happy 
3 weary 
4 sorrowful 
5 hungry 

The so-called context provided for the word "sad" is entirely superfluous; 
the item can be answered with complete confidence by an examinee who knows that 
the word "sad" means "sorrowful." So this test measures knowledge of words 
presented in isolation, as do the sample items for Tests 1 and 9, although 
these tests are labelled "Visual Verbal Meaning Test" and "Vocabulary in Isolation 
Test," respectively. It should be noted that Test 1 was speeded (4 minutes 
allowed for 50 four-choice items) while Tests 8 and 9 were essentially unspeeded. 

The variance of the speed-of-comprehension criterion variable has fairly 
large components that may, perhaps, be regarded as: 

1. Knowledge of word meanings; 

2. Reasoning facility; 

3. Visual memory for word forms; 

4. General information; 

5. Interest in literary rather than computational activities. 

The variance of the level-of -comprehension criterion variable has fairly 
large components that may, perhaps, be regarded as: 

1. Knowledge of word meanings; 

2. Reasoning facility; 

3. General information; 

4. Interest in literary rather than mechanical activities. 

These interpretations of the data differ markedly from those given by 
Holmes and Singer (1966, pp. 62-78), whose interpretations seem to be based 
on an impression that each of their predictor tests is made up of homogeneous 
variance that is adequately described by the title of the test. 

In addition to the regression analyses reported above, Holmes and Singer 
also performed a centroid factor analysis of the common variance of the two 
criterion variables and the 54 predictor variables. Nine factors were extracted 
and rotated by the normalized varimax procedure. Of these, six measure variables 
essentially irrelevant to either of the two criterion variables. This result is 
a little surprising when it is recalled that these 54 predictor variables were 
chosen in the light of the substrata-factor theory to be relevant to an analysis 
of speed and power in reading. The rotated factor that accounted for the largest 
percentage of the variance (27 per cent) appears to measure general verbal facility. 
The second largest factor appears to measure the reporting of problems in personal 
adjustment and is not relevant to the criterion measures of reading. The third 
largest factor seems, after reflection of signs, to be a music aptitude and 
appreciation variable. It has little relationship to reading. The fourth largest 
factor appears, after reflection of signs, to measure visual-perception ability. 
The fifth largest factor probably measures speed of mental operation, especially 
in taking tests. The remaining factors have no appreciable relationship to speed 
or level of comprehension in reading. Three of them appear to be interest factors 
and the fourth is not defined clearly enough to warrant interpretation. To 
conclude from this study that comprehension in reading involves general verbal 
facility, speed of mental operation, and some skill in visual perception is 
probably correct but does not seem particularly helpful in understanding the 
processes of comprehension or of developing skill in them. It would appear that 
this effort to test certain aspects of the substrata theory of reading was not 
rewarding, perhaps partly because of an unfortunate selection of criterion and 
predictor tests and partly because of statistical procedures. 

Let me complete this paper by saying that after a half century of 
psychometric research on comprehension I believe that Professor Thorndike's 
1917 model of comprehension as mainly a composite of recalling word meanings 
and reasoning with them has stood the test of time. Davis's studies (1941, 
1968) confirmed Thorndike's findings, quantified them, and extended their 
application to secondary-school and college students. These studies also 
added several more specific skills that should be taken into account. 

How should we use these findings and what research ought to come next? 

First, tests of comprehension for secondary schools and colleges should be 
referenced to Davis's list of behavioral skills shown in Table 2. This is a 
fundamental step in achieving content validity through the development of 
criterion-referenced tests at appropriate levels of difficulty. 

Second, workbooks designed to develop the behavioral skills of comprehension 
should be developed and used to allow pupils consciously to practice these 
skills in useful content materials. Sample exercises of the type that might be 
useful in such workbooks are shown in Table 3. Third, tests should be supplied 
with the workbooks to aid in evaluating their usefulness and to permit informal 
diagnosis of the performance of individual pupils. 

The next steps in research in comprehension should, in my judgment, be 
to apply Davis's model of the process to elementary-school pupils in grades 2-3 
and to middle-school pupils in grades 4 through 6. Adaptations in the model 
will have to be made and new types of comprehension exercises will have to 
be devised. At least one doctoral dissertation is now under way in this 
field. Translation of the model into the language and concepts of psycholinguistic 
theory has already been accomplished. 

Along with the extension of Davis's model of comprehension to use in 
grades 2-6, a new model of the process of decoding must be developed. A 
preliminary version of this model has already been developed, criterion-referenced 
tests based on it have been written, and these tests are being 
tried out in Title I classes this week under the auspices of the Chicago 
Board of Education and the Educational Records Bureau. 